Deconstruction of Archaeal Genome Depict Strategic Consensus in Core Pathways Coding Sequence Assembly
Abstract
A comprehensive in silico analysis was performed on 71 species representing the different taxonomic classes and physiological groups of the domain Archaea. These organisms differed in their physiological attributes, particularly oxygen tolerance and energy metabolism. We explored the diversity and similarity of codon usage patterns in the genes and genomes of these organisms, with emphasis on their core cellular pathways. Our aim was to determine whether there is any underlying similarity in the design of core pathways within these organisms. Analyses of codon utilization patterns, hierarchical linear models of codon usage, expression patterns and codon pair preferences indicated that in the archaea there is a trend towards biased use of synonymous codons in the core cellular pathways, and the Nc-plots appeared to reflect the physiological variations among the different species. Our analyses revealed that aerobic species of archaea possessed a larger degree of freedom in regulating expression levels than could be accounted for by codon usage bias alone. This feature might be a consequence of their enhanced metabolic activity resulting from adaptation to a relatively O2-rich environment. Taxonomically related species of archaea were found to have striking similarities in their ORF structuring patterns. In the anaerobic species of archaea, codon bias was found to be a major determinant of gene expression. We also detected a significant difference in codon pair usage between the whole genome and the genes related to vital cellular pathways; this difference was not only species-specific but also pathway-specific. This suggests that ORFs are structured for better decoding accuracy during translation. Finally, a codon-pathway interaction in shaping the codon design of pathways was observed, in which the transcription pathway exhibited a significantly different coding frequency signature.
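The abstract does not say how the Nc-plots were built. As a rough illustration only, the sketch below assumes the conventional construction, in which a gene's effective number of codons (Nc) is compared with the value expected from its GC content at third codon positions (GC3) under Wright's no-selection curve, Nc = 2 + s + 29 / (s^2 + (1 - s)^2) with s = GC3. The function names `split_codons`, `gc3` and `expected_nc` are hypothetical and are not taken from the paper.

```python
# Minimal sketch of the reference curve used in a conventional Nc-plot.
# Assumption: GC3 is computed over codons other than Met (ATG), Trp (TGG)
# and the three standard stop codons, which carry no synonymous choice.

NON_SYNONYMOUS = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def split_codons(seq):
    """Split an in-frame coding sequence into codons, dropping a partial tail."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def gc3(seq):
    """Fraction of G or C at the third position of codons with synonymous alternatives."""
    informative = [c for c in split_codons(seq) if c not in NON_SYNONYMOUS]
    if not informative:
        raise ValueError("sequence has no codons with synonymous alternatives")
    return sum(c[2] in "GC" for c in informative) / len(informative)


def expected_nc(s):
    """Expected Nc for a given GC3 value s when only base composition shapes usage."""
    return 2.0 + s + 29.0 / (s ** 2 + (1.0 - s) ** 2)


if __name__ == "__main__":
    toy_orf = "ATGGCCAAATAA"  # made-up sequence, for illustration only
    s = gc3(toy_orf)
    print(f"GC3 = {s:.2f}, expected Nc = {expected_nc(s):.1f}")
```

In such a plot, genes whose observed Nc lies well below the curve at their GC3 value are usually read as showing codon bias beyond what base composition alone would produce.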
Similar resources
Identified Hybrid tRNA Structure Genes in Archaeal Genome
Background: In Archaea, previous studies have revealed the presence of multiple intron-containing tRNAs and split tRNAs. The full unexpurgated analysis of archaeal tRNA genes remains a challenging task in the field of bioinformatics, because of the presence of various types of hidden tRNA genes in archaea. Here, we suggested a computational method that searched for widely separ...
Genome Sequence of a Novel Archaeal Fusellovirus Assembled from the Metagenome of a Mexican Hot Spring
The consensus genome sequence of a new member of the family Fuselloviridae designated as SMF1 (Sulfolobales Mexican fusellovirus 1) is presented. The complete circular genome was recovered from a metagenomic study of a Mexican hot spring. SMF1 exhibits an exceptional coding strand bias and a reduced set of fuselloviral core genes.
Novel coding regions in four complete archaeal genomes.
In the process of analysing the four available complete archaeal genomes, we have noted that certain regions characterised as 'non-coding' exhibit significant sequence similarity to other protein sequences from Archaea and other species. Using established technology, we have identified a number of potential protein coding regions in these putative 'non-coding' regions. We have detected 524 such...
Archaeal ribosomal protein L7 is a functional homolog of the eukaryotic 15.5kD/Snu13p snoRNP core protein.
Recent investigations have identified homologs of eukaryotic box C/D small nucleolar RNAs (snoRNAs) in Archaea termed sRNAs. Archaeal homologs of the box C/D snoRNP core proteins fibrillarin and Nop56/58 have also been identified but a homolog for the eukaryotic 15.5kD snoRNP protein has not been described. Our sequence analysis of archaeal genomes reveals that the highly conserved ribosomal pr...
Parallelization of MIRA Whole Genome and EST Sequence Assembler
The genome assembly problem is to generate the original DNA sequence of the organism from a large set of short overlapping fragments. MIRA is an open source assembler based on the Overlap Layout Consensus (OLC) graph model which addresses the assembly problem and is widely used by biologists [1,2]. Like other assemblers MIRA takes a long time to compute the assembly for large number of sequence...